Members: Can Balkose, Zep Van Boxtel, Stan Vos, Julia Michels
Student numbers: 6068383 , 4903684 , 4725603 , 4996569
Requires data modeling and quantitative research in Transport, Infrastructure & Logistics
Research Question:
What effect did the COVID-19 pandemic have on transportation usage and mode choice across regions and demographic groups in the Netherlands?
Objectives
-To analyze and visualize the impact of the COVID-19 pandemic on transportation usage and mode choice in different regions of the Netherlands.
-To identify the key demographic factors that influenced transportation mode choice during the pandemic.
-To understand how the pandemic affected transportation usage and mode choice differently in urban and rural areas.
-To understand how transportation behavior changed across demographic groups after the pandemic, and to draw conclusions about potential long-term impacts on post-pandemic transportation behavior.
Be specific. Some of the tasks can be coding (expect everyone to do this), background research, conceptualisation, visualisation, data analysis, data modelling
Author 1:
Author 2:
Author 3:
import pandas as pd
from scipy.signal import find_peaks
from scipy.signal import argrelextrema
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
#data of distance covered in regions based on urbanization of region and mode of transport
data_distance_mode_urban= 'data/Distance_covered_on_different_urban_areas.csv'
df_urbanization_mode_urban = pd.read_csv(data_distance_mode_urban)
df_urbanization_mode_urban
| | Year | Region | Mode of Transport | Total Distance (billion km) |
|---|---|---|---|---|
| 0 | 2018 | Extremely urbanized | Combined | 46.8 |
| 1 | 2019 | Extremely urbanized | Combined | 46.2 |
| 2 | 2020 | Extremely urbanized | Combined | 31.4 |
| 3 | 2021 | Extremely urbanized | Combined | 36.3 |
| 4 | 2022 | Extremely urbanized | Combined | 40.9 |
| ... | ... | ... | ... | ... |
| 195 | 2018 | Not urbanized | Other | 2.0 |
| 196 | 2019 | Not urbanized | Other | 2.1 |
| 197 | 2020 | Not urbanized | Other | 1.5 |
| 198 | 2021 | Not urbanized | Other | 1.4 |
| 199 | 2022 | Not urbanized | Other | 1.3 |
200 rows × 4 columns
#data of usage of public transportation in different demographics
data_usage_of_public_transport= 'data/Usage_of_public_transportation.csv'
df_usage_of_public_transport = pd.read_csv(data_usage_of_public_transport)
df_usage_of_public_transport
| | Demographic | Year | Usage of public transportation (%) |
|---|---|---|---|
| 0 | Age: 12 to 17 years | 2018 | 11.7 |
| 1 | Age: 12 to 17 years | 2019 | 10.7 |
| 2 | Age: 12 to 17 years | 2020 | 6.3 |
| 3 | Age: 12 to 17 years | 2021 | 6.3 |
| 4 | Age: 12 to 17 years | 2022 | 9.2 |
| ... | ... | ... | ... |
| 100 | No driver's license; 17 years or older | 2018 | 17.5 |
| 101 | No driver's license; 17 years or older | 2019 | 16.3 |
| 102 | No driver's license; 17 years or older | 2020 | 8.5 |
| 103 | No driver's license; 17 years or older | 2021 | 9.8 |
| 104 | No driver's license; 17 years or older | 2022 | 13.7 |
105 rows × 3 columns
#the amount of traffic on Dutch highways on weekdays and weekends compared to 2019 (2019 = 100)
data_traffic_highways = 'data/CBS Dutch highway traffic.csv'
df_data_traffic_highways = pd.read_csv(data_traffic_highways)
df_data_traffic_highways = df_data_traffic_highways.iloc[:-3]
df_data_traffic_highways
| | Week | Doordeweeks, 2020 (2019 = 100) | In het weekeinde, 2020 (2019 = 100) | Doordeweeks, 2021 (2019 = 100) | In het weekeinde, 2021 (2019 = 100) | Doordeweeks, 2022 (2019 = 100) | In het weekeinde, 2022 (2019 = 100) | Doordeweeks, 2023 (2019 = 100) | In het weekeinde, 2023 (2019 = 100) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 83.0 | 101.0 | 71.0 | 67.0 | 96.0 | 82.0 | 103.0 | 99.0 |
| 1 | 2 | 99.0 | 102.0 | 79.0 | 64.0 | 86.0 | 86.0 | 93.0 | 98.0 |
| 2 | 3 | 100.0 | 102.0 | 77.0 | 65.0 | 85.0 | 84.0 | 91.0 | 95.0 |
| 3 | 4 | 104.0 | 106.0 | 78.0 | 67.0 | 88.0 | 91.0 | 98.0 | 103.0 |
| 4 | 5 | 102.0 | 103.0 | 78.0 | 48.0 | 87.0 | 86.0 | 95.0 | 100.0 |
| 5 | 6 | 99.0 | 88.0 | 62.0 | 61.0 | 87.0 | 91.0 | 92.0 | 99.0 |
| 6 | 7 | 97.0 | 90.0 | 73.0 | 68.0 | 82.0 | 82.0 | 93.0 | 85.0 |
| 7 | 8 | 99.0 | 87.0 | 80.0 | 68.0 | 88.0 | 89.0 | 92.0 | 95.0 |
| 8 | 9 | 94.0 | 105.0 | 78.0 | 74.0 | 86.0 | 92.0 | 91.0 | 104.0 |
| 9 | 10 | 98.0 | 99.0 | 80.0 | 63.0 | 88.0 | 89.0 | 93.0 | 96.0 |
| 10 | 11 | 91.0 | 67.0 | 80.0 | 71.0 | 88.0 | 91.0 | 94.0 | 99.0 |
| 11 | 12 | 60.0 | 38.0 | 80.0 | 68.0 | 89.0 | 92.0 | 93.0 | 91.0 |
| 12 | 13 | 51.0 | 33.0 | 80.0 | 67.0 | 87.0 | 85.0 | 92.0 | 93.0 |
| 13 | 14 | 52.0 | 33.0 | 77.0 | 60.0 | 88.0 | 84.0 | 95.0 | 88.0 |
| 14 | 15 | 52.0 | 35.0 | 76.0 | 65.0 | 90.0 | 87.0 | 89.0 | 94.0 |
| 15 | 16 | 47.0 | 39.0 | 76.0 | 67.0 | 86.0 | 92.0 | 89.0 | 92.0 |
| 16 | 17 | 58.0 | 53.0 | 75.0 | 85.0 | 84.0 | 107.0 | 85.0 | 113.0 |
| 17 | 18 | 56.0 | 49.0 | 81.0 | 80.0 | 89.0 | 103.0 | 91.0 | 99.0 |
| 18 | 19 | 61.0 | 57.0 | 77.0 | 75.0 | 93.0 | 93.0 | 94.0 | 96.0 |
| 19 | 20 | 66.0 | 57.0 | 83.0 | 74.0 | 89.0 | 91.0 | 87.0 | 100.0 |
| 20 | 21 | 64.0 | 61.0 | 78.0 | 80.0 | 86.0 | 97.0 | 93.0 | 96.0 |
| 21 | 22 | 78.0 | 62.0 | 89.0 | 75.0 | 98.0 | 83.0 | 97.0 | 87.0 |
| 22 | 23 | 73.0 | 70.0 | 84.0 | 88.0 | 88.0 | 98.0 | 93.0 | 104.0 |
| 23 | 24 | 81.0 | 73.0 | 87.0 | 84.0 | 94.0 | 95.0 | 94.0 | 96.0 |
| 24 | 25 | 81.0 | 82.0 | 86.0 | 84.0 | 90.0 | 92.0 | 91.0 | 97.0 |
| 25 | 26 | 84.0 | 81.0 | 87.0 | 88.0 | 91.0 | 93.0 | 91.0 | 93.0 |
| 26 | 27 | 86.0 | 84.0 | 88.0 | 90.0 | 90.0 | 93.0 | 91.0 | 95.0 |
| 27 | 28 | 86.0 | 89.0 | 86.0 | 87.0 | 91.0 | 94.0 | 93.0 | 95.0 |
| 28 | 29 | 89.0 | 95.0 | 88.0 | 88.0 | 89.0 | 94.0 | 92.0 | 95.0 |
| 29 | 30 | 93.0 | 95.0 | 89.0 | 88.0 | 93.0 | 97.0 | NaN | NaN |
| 30 | 31 | 93.0 | 91.0 | 88.0 | 87.0 | 92.0 | 93.0 | NaN | NaN |
| 31 | 32 | 91.0 | 90.0 | 90.0 | 94.0 | 91.0 | 93.0 | NaN | NaN |
| 32 | 33 | 88.0 | 91.0 | 90.0 | 96.0 | 90.0 | 98.0 | NaN | NaN |
| 33 | 34 | 90.0 | 84.0 | 91.0 | 87.0 | 92.0 | 92.0 | NaN | NaN |
| 34 | 35 | 90.0 | 86.0 | 91.0 | 90.0 | 94.0 | 91.0 | NaN | NaN |
| 35 | 36 | 92.0 | 92.0 | 94.0 | 90.0 | 94.0 | 93.0 | NaN | NaN |
| 36 | 37 | 90.0 | 92.0 | 92.0 | 96.0 | 92.0 | 89.0 | NaN | NaN |
| 37 | 38 | 92.0 | 89.0 | 94.0 | 93.0 | 91.0 | 91.0 | NaN | NaN |
| 38 | 39 | 89.0 | 74.0 | 93.0 | 93.0 | 91.0 | 80.0 | NaN | NaN |
| 39 | 40 | 88.0 | 75.0 | 96.0 | 97.0 | 96.0 | 94.0 | NaN | NaN |
| 40 | 41 | 82.0 | 76.0 | 92.0 | 98.0 | 91.0 | 93.0 | NaN | NaN |
| 41 | 42 | 81.0 | 70.0 | 93.0 | 98.0 | 91.0 | 94.0 | NaN | NaN |
| 42 | 43 | 79.0 | 68.0 | 92.0 | 90.0 | 94.0 | 97.0 | NaN | NaN |
| 43 | 44 | 77.0 | 66.0 | 90.0 | 85.0 | 91.0 | 86.0 | NaN | NaN |
| 44 | 45 | 78.0 | 68.0 | 90.0 | 83.0 | 93.0 | 94.0 | NaN | NaN |
| 45 | 46 | 79.0 | 69.0 | 85.0 | 83.0 | 94.0 | 96.0 | NaN | NaN |
| 46 | 47 | 81.0 | 72.0 | 86.0 | 80.0 | 93.0 | 97.0 | NaN | NaN |
| 47 | 48 | 82.0 | 75.0 | 81.0 | 76.0 | 91.0 | 88.0 | NaN | NaN |
| 48 | 49 | 83.0 | 73.0 | 85.0 | 80.0 | 92.0 | 95.0 | NaN | NaN |
| 49 | 50 | 82.0 | 72.0 | 85.0 | 79.0 | 92.0 | 90.0 | NaN | NaN |
| 50 | 51 | 76.0 | 63.0 | 78.0 | 72.0 | 87.0 | 79.0 | NaN | NaN |
| 51 | 52 | 83.0 | 59.0 | 89.0 | 61.0 | 105.0 | 96.0 | NaN | NaN |
| 52 | 53 | 86.0 | 65.0 | NaN | NaN | NaN | NaN | NaN | NaN |
data_2018_2022 = 'data/2018_2022.csv'
df_data_2018_2022 = pd.read_csv(data_2018_2022)
df_data_2018_2022
| | Year | Demographic | Urbanization | Trips | Travel distance | Travel time |
|---|---|---|---|---|---|---|
| 0 | 2018 | Age: 18 to 24 years | Extremely urbanised | 1018 | 13747 | 530.4 |
| 1 | 2018 | Age: 25 to 34 years | Extremely urbanised | 1020 | 15660 | 536.8 |
| 2 | 2018 | Age: 35 to 49 years | Extremely urbanised | 1105 | 13138 | 492.2 |
| 3 | 2018 | Age: 50 to 64 years | Extremely urbanised | 990 | 12828 | 480.1 |
| 4 | 2018 | Age: 65 to 74 years | Extremely urbanised | 864 | 9695 | 443.8 |
| ... | ... | ... | ... | ... | ... | ... |
| 433 | 2022 | No driver's license, 17 years or older | Extremely urbanised | 706 | 6068 | 400.2 |
| 434 | 2022 | No driver's license, 17 years or older | Strongly urbanised | 753 | 7465 | 391.6 |
| 435 | 2022 | No driver's license, 17 years or older | Moderately urbanised | 724 | 6877 | 377.2 |
| 436 | 2022 | No driver's license, 17 years or older | Hardly urbanised | 702 | 7676 | 349.6 |
| 437 | 2022 | No driver's license, 17 years or older | Not urbanised | 646 | 7450 | 350.2 |
438 rows × 6 columns
#filter out the rows where mode of transport is 'combined'
filtered_df_urbanization_mode_urban = df_urbanization_mode_urban[df_urbanization_mode_urban["Mode of Transport"] != 'Combined']
The pie charts of distance covered by mode of transport in the Netherlands from 2018 to 2022 indicate a decline in public transport usage during the pandemic. Public transport usage recovered somewhat afterwards, but it did not return to pre-pandemic levels. This suggests that fear and uncertainty surrounding the virus had a lasting impact on public transport usage.
#pie chart to visualise the distance covered by mode of transport per year
years_to_visualize = [2018, 2019, 2020, 2021, 2022]
for year in years_to_visualize:
    df_year = filtered_df_urbanization_mode_urban[filtered_df_urbanization_mode_urban['Year'] == year]
    mode_distance = df_year.groupby('Mode of Transport')['Total Distance (billion km)'].sum()
    plt.figure(figsize=(3, 3))
    plt.pie(mode_distance, labels=mode_distance.index, autopct='%1.1f%%', startangle=140)
    plt.title(f'Distance Covered by Mode of Transport in {year}')
    plt.axis('equal')
    plt.show()
# For the data for the Equivalised income groups
filtered_income_df_usage_of_public_transport = df_usage_of_public_transport[df_usage_of_public_transport['Demographic'].str.contains('Equ')]
Before the pandemic, many higher-income individuals in the Netherlands used public transport. When the pandemic started, public transport usage dropped significantly, possibly because of health concerns about crowded spaces. The decline affected all income groups, but higher-income groups saw a larger decrease, as seen in the animation.
Even after the effects of the pandemic subsided in 2022, public transport usage among higher-income groups had not recovered as much as among lower-income groups.
fig = px.bar(
    filtered_income_df_usage_of_public_transport,
    x="Demographic",
    y="Usage of public transportation (%)",
    color='Demographic',
    animation_frame="Year",
    range_y=[0, 20],
    title="Usage of Public Transportation Over Years",
    labels={"Usage of public transportation (%)": "Usage (%)"},
)
fig.update_xaxes(categoryorder='total descending')
fig.show()
Age is another interesting demographic factor. As seen in the animation below, public transport usage among younger age groups has recovered much better than among older age groups, who are at risk of more severe illness from the virus. Younger age groups, who are expected to have lower incomes, also tend to use public transport more than older age groups, which is consistent with the income animation above.
# For the data for age
filtered_age_df_usage_of_public_transport = df_usage_of_public_transport[df_usage_of_public_transport['Demographic'].str.contains('Age')]
fig = px.bar(
    filtered_age_df_usage_of_public_transport,
    x="Demographic",
    y="Usage of public transportation (%)",
    color='Demographic',
    animation_frame="Year",
    range_y=[0, 20],
    title="Usage of Public Transportation Over Years",
    labels={"Usage of public transportation (%)": "Usage (%)"},
)
fig.update_xaxes(categoryorder='total descending')
fig.show()
The graph below shows public transport usage over the years by driving license and car ownership. People without a driving license showed the largest rebound in public transport usage, which suggests that people with a driving license still avoid public transport post-pandemic.
filtered_driver_license_df_usage_of_public_transport = df_usage_of_public_transport[df_usage_of_public_transport['Demographic'].str.contains('river')]
fig = px.line(
    filtered_driver_license_df_usage_of_public_transport,
    x="Year",
    y="Usage of public transportation (%)",
    color="Demographic",
    title="Usage of Public Transportation Over Years by Driver License and Car Ownership",
    labels={"Usage of public transportation (%)": "Usage (%)"},
    markers=True
)
desired_years = [2018, 2019, 2020, 2021, 2022]
#use numeric tick positions to match the numeric Year axis; the tick labels are strings
fig.update_xaxes(tickvals=desired_years, ticktext=[str(year) for year in desired_years])
fig.show()
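The recovery comparisons in the sections above (income, age, driving license) can also be checked numerically. The following is a minimal sketch, not part of the original analysis: it expresses 2022 usage as a fraction of 2019 usage per demographic. The helper name `recovery_ratio` and the choice of 2019 as the pre-pandemic baseline are assumptions made for illustration. The sample rows are copied from the table preview shown earlier; in the notebook the function would be applied to `df_usage_of_public_transport`.

```python
import pandas as pd


def recovery_ratio(df, base_year=2019, target_year=2022):
    """Return target-year usage as a fraction of base-year usage per demographic.

    Hypothetical helper: assumes one row per (Demographic, Year) pair.
    """
    wide = df.pivot(index="Demographic", columns="Year",
                    values="Usage of public transportation (%)")
    return (wide[target_year] / wide[base_year]).sort_values()


# Sample rows copied from the table preview; the full CSV has 105 rows.
sample = pd.DataFrame({
    "Demographic": ["Age: 12 to 17 years"] * 5
                   + ["No driver's license; 17 years or older"] * 5,
    "Year": [2018, 2019, 2020, 2021, 2022] * 2,
    "Usage of public transportation (%)": [11.7, 10.7, 6.3, 6.3, 9.2,
                                           17.5, 16.3, 8.5, 9.8, 13.7],
})

ratios = recovery_ratio(sample)
print(ratios.round(2))
```

A ratio close to 1 means usage is back near its 2019 level. Note that a relative ratio and an absolute rebound in percentage points can rank groups differently, so it may be worth reporting both.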
The graphs of distance covered in different urban areas of the Netherlands from 2018 to 2022 show a sharp drop in transportation usage in 2020, followed by a partial recovery. The level of urbanization did not appear to correlate strongly with the size of the decline, suggesting that individual characteristics of people played a more important role in shaping mobility patterns than location.
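#keep only the 'Combined' rows so each line shows the total distance per region rather than an average over modes
df = df_urbanization_mode_urban[df_urbanization_mode_urban["Mode of Transport"] == 'Combined']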
sns.set_style("whitegrid")
plt.figure(figsize=(12, 6))
sns.lineplot(data=df, x="Year", y="Total Distance (billion km)", hue="Region", marker="o", palette=sns.color_palette("hsv", len(df['Region'].unique())), errorbar=None)
plt.title("Effects of COVID-19 on Distance Traveled in Different Regions")
plt.xlabel("Year")
plt.ylabel("Total Distance (billion km)")
plt.legend(title="Region", loc='center left', bbox_to_anchor=(1, 0.5))
plt.grid(True)
plt.show()
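To compare regions of very different size, it can help to index each region's combined distance to a common base year. The sketch below is an illustration only: the helper name `index_to_base_year` and the use of 2019 as the base year are assumptions. The sample rows are the "Extremely urbanized" rows from the table preview; in the notebook the function would be applied to `df_urbanization_mode_urban`.

```python
import pandas as pd


def index_to_base_year(df, base_year=2019):
    """Express combined distance per region as an index (base year = 100).

    Hypothetical helper: uses only the 'Combined' rows of the input.
    """
    combined = df[df["Mode of Transport"] == "Combined"]
    wide = combined.pivot(index="Region", columns="Year",
                          values="Total Distance (billion km)")
    # Divide every year column by the base-year column, region by region.
    return wide.div(wide[base_year], axis=0) * 100


# Sample rows copied from the table preview; the full CSV has 200 rows.
sample = pd.DataFrame({
    "Year": [2018, 2019, 2020, 2021, 2022],
    "Region": ["Extremely urbanized"] * 5,
    "Mode of Transport": ["Combined"] * 5,
    "Total Distance (billion km)": [46.8, 46.2, 31.4, 36.3, 40.9],
})

indexed = index_to_base_year(sample)
print(indexed.round(1))
```

With all five regions in the input, similar index values in 2020 would support the claim that urbanization level had little influence on the size of the decline.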
sns.set_style("whitegrid")
plt.figure(figsize=(12, 6))
sns.lineplot(data=df_data_2018_2022, x="Year", y="Travel time", style="Urbanization", markers=True, dashes=False,errorbar=None)
plt.title("Travel Time per Region")
plt.xlabel("Year")
plt.ylabel("Travel time")
plt.legend(title="Urbanization", loc='center left', bbox_to_anchor=(1, 0.5))
plt.grid(True)
plt.show()
sns.set_style("whitegrid")
plt.figure(figsize=(12, 6))
for column in df_data_traffic_highways.columns[1:]:
    sns.lineplot(x="Week", y=column, data=df_data_traffic_highways, label=column)
plt.legend(loc="upper right")
plt.xlabel("Week")
plt.ylabel("Value (2019 = 100)")
plt.title("Traffic Data Over Weeks")
plt.show()
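The depth and timing of the traffic dip can be read off the highway data directly. The sketch below is an illustration only: the helper name `lowest_week` is an assumption. It returns, for each index column, the week with the lowest value and that value. The sample holds weeks 9 to 20 of the two 2020 columns, copied from the table above; in the notebook the function would be applied to `df_data_traffic_highways`.

```python
import pandas as pd


def lowest_week(df, week_col="Week"):
    """For each index column, return the week with the lowest value and that value.

    Hypothetical helper: missing values are skipped; ties resolve to the first week.
    """
    values = df.set_index(week_col)
    return pd.DataFrame({"week": values.idxmin(), "index_value": values.min()})


# Weeks 9 to 20 of 2020, copied from the table above (2019 = 100).
sample = pd.DataFrame({
    "Week": list(range(9, 21)),
    "Doordeweeks, 2020 (2019 = 100)":
        [94, 98, 91, 60, 51, 52, 52, 47, 58, 56, 61, 66],
    "In het weekeinde, 2020 (2019 = 100)":
        [105, 99, 67, 38, 33, 33, 35, 39, 53, 49, 57, 57],
})

lows = lowest_week(sample)
print(lows)
```

In this sample, weekday traffic ("Doordeweeks") bottoms out later and less deeply than weekend traffic ("In het weekeinde").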